-
-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
mv: gnu test case part-hardlink
fix
#6632
base: main
Are you sure you want to change the base?
Conversation
GNU testsuite comparison:
|
Bravo
|
f32325e
to
1da64ed
Compare
GNU testsuite comparison:
|
1 similar comment
GNU testsuite comparison:
|
1da64ed
to
2c7bee1
Compare
GNU testsuite comparison:
|
GNU testsuite comparison:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for tackling these issues! However, combining these three behavioral changes seems like a bad idea, given that each of them have delicate consequences.
It seems that get_dir_content
is the entirely wrong tool to approach mv
, and it causes incorrect behavior.
Finally, it seems you fell into a common trap / bad pattern: Guessing a potential error condition (and not detecting others), then doing an action (and badly handling most error conditions.) This causes unhelpful error messages and unnecessary syscalls.
src/uu/mv/src/mv.rs
Outdated
} | ||
// We fall back to copying and then removing. | ||
else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move the comment into the else
clause. Currently, it looks misleading, as if the if
is now complete.
src/uu/mv/src/mv.rs
Outdated
if !from.exists() { | ||
if let Some(msg) = from.to_str() { | ||
let msg = format!("Path \"{}\" does not exist or you don't have access!", msg); | ||
return Err(FsXError::new(FsXErrorKind::NotFound, &msg)); | ||
} | ||
return Err(FsXError::new( | ||
FsXErrorKind::NotFound, | ||
"Path does not exist or you don't have access!", | ||
)); | ||
} | ||
|
||
let dir_content = get_dir_content(from)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This misses some error conditions, which leads to the following confusing behavior, both on main and on this PR:
$ rm -rf /tmp/foobar/ && cargo run -q --features mv mv -v /root/ /tmp/foobar
mv: Permission denied
[$? = 1]
Which permission was denied? When accessing which file? Is the permission problem with the source or with the destination? I hope you can see how this error message might be confusing when there's many files being moved/copied.
GNU offers a much more user-friendly error message:
$ rm -rf /tmp/foobar/ && mv -v /root/ /tmp/foobar
created directory '/tmp/foobar'
mv: cannot access '/root/': Permission denied
[$? = 1]
The same holds with the error message on line 770: It doesn't even mention the filename.
Please instead handle errors on the actions that cause them. In this case, handling the error when it happens (instead of trying to guess) completely eliminates the need for the from.exists()
call.
src/uu/mv/src/mv.rs
Outdated
if let Some(file_name) = from.file_name().and_then(|file_name| file_name.to_str()) { | ||
file_name.to_string() | ||
} else { | ||
return Err(FsXError::new( | ||
FsXErrorKind::InvalidFileName, | ||
"Invalid file name", | ||
)); | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
File names are not necessarily valid UTF-8. This code would raise an error if such a file name is encountered. Does that mean non-UTF8-filenames cause an error when enabling progress bars?
We already don't have good handling for non-UTF-8 filenames, but let's try to avoid making the situation worse.
tests/by-util/test_mv.rs
Outdated
|
||
// Ensure that the folder structure is preserved, files are copied, and their contents remain intact. | ||
assert_eq!( | ||
read_to_string(other_fs_tempdir.path().join("a/b/d"),).expect("Unable to read other_fs_file"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is there a spurious comma?
read_to_string(other_fs_tempdir.path().join("a/b/d"),).expect("Unable to read other_fs_file"), | |
read_to_string(other_fs_tempdir.path().join("a/b/d")).expect("Unable to read other_fs_file"), |
tests/by-util/test_mv.rs
Outdated
"d" | ||
); | ||
assert_eq!( | ||
read_to_string(other_fs_tempdir.path().join("a/b/c/e"),).expect("Unable to read other_fs_file"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
src/uu/mv/src/mv.rs
Outdated
let result_file_move = if let Some(new_source) = moved_files.get(file_info) { | ||
fs::hard_link(new_source, to)?; | ||
Ok(file_info.file_size()) | ||
} else if let Some(progress_bar) = progress_bar { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you write code for the progress-bar case, then please add a test that exercises the progress-bar case.
src/uu/mv/src/mv.rs
Outdated
)); | ||
} | ||
|
||
let dir_content = get_dir_content(from)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_dir_content
does not return sufficient information. GNU mv copies/moves as many files/directories as it can, even if it first encounter some files where it has insufficient permissions:
$ mv -v source/ /tmp
created directory '/tmp/source'
created directory '/tmp/source/d1'
mv: cannot access 'source/d1': Permission denied
created directory '/tmp/source/d3'
mv: cannot access 'source/d3': Permission denied
copied 'source/f2' -> '/tmp/source/f2'
[$? = 1]
$ ls source/ # source has NOT been deleted!
d1 d3 f2
$ ls /tmp/source/
d1 d3 f2
However, get_dir_content
bails completely if there are any permission issues:
$ cargo run -q --features mv mv -v source/ /tmp
mv: cannot move 'source/' to '/tmp/source': Permission denied
[$? = 1]
$ ls /tmp/source/
ls: cannot access '/tmp/source/': No such file or directory
[$? = 2]
And the final nail in the coffin is that get_dir_content
insists to first read the entire filesystem structure, which uses up potentially vast amounts of memory, and gives ample opportunity for all kinds of TOCTOU races: https://docs.rs/fs_extra/latest/fs_extra/dir/struct.DirContent.html
Therefore, we should avoid using get_dir_content
entirely. I'm not even sure it can be used correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you're right. Instead of using get_dir_content
, we could follow the approach used in uu_cp
, which involves walking through the directory, copying files, and creating folders as we go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops, I meant to set the "Request changes" flag. See above for my reasoning. In particular, get_dir_content
must go.
May I suggest to split the tree changes into two or three PRs?
4b9e64e
to
666f8cb
Compare
GNU testsuite comparison:
|
666f8cb
to
119ce64
Compare
GNU testsuite comparison:
|
119ce64
to
e8c9ffc
Compare
GNU testsuite comparison:
|
3754a06
to
338ed28
Compare
@BenWiederhake I cleaned up this PR. Now, it focuses solely on the directory copying part. I'll work on verbose output and error reporting next. For now, could you let me know if this approach looks okay? |
GNU testsuite comparison:
|
GNU testsuite comparison:
|
GNU testsuite comparison:
|
GNU testsuite comparison:
|
GNU testsuite comparison:
|
102a27d
to
dd06b70
Compare
GNU testsuite comparison:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In summary, yes, this PR is definitely on the right track, now that it no longer unconditionally buffers all filenames before acting on anything.
There are some small issues and edge cases, but I hope they are reasonably easy to tackle. Please say something if I guessed wrong, or mark specific issues as out-of-scope.
I hope I don't intimidate you by the sheer amount of feedback: This is definitely a good start, and your work is very much appreciated!
@@ -98,7 +104,69 @@ pub enum OverwriteMode { | |||
///'-f' '--force' overwrite without prompt | |||
Force, | |||
} | |||
struct VerboseContext<'a> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(nit, please avoid in the future: It's uncommon to define structs and impls before constants. But it's correct, compiles, and it kinda makes sense, so it's fine.)
src/uu/mv/src/mv.rs
Outdated
} | ||
|
||
fn create_folder(&self, path: &Path) { | ||
println!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is pb.suspend
necessary for some prints but not others?
src/uu/mv/src/mv.rs
Outdated
rename_with_fallback(from, to, multi_progress, verbose_context)?; | ||
|
||
match multi_progress { | ||
Some(pb) => pb.suspend(|| { | ||
println!("{message}"); | ||
}), | ||
None => println!("{message}"), | ||
}; | ||
} | ||
Ok(()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of
rename_with_fallback(from, to, multi_progress, verbose_context)?;
Ok(())
// end of function
you can usually just write
rename_with_fallback(from, to, multi_progress, verbose_context)
In fact, clippy should have suggested this to you. Did something go wrong in the CI clippy setup? Or did I miss some implicit error conversion?
src/uu/mv/src/mv.rs
Outdated
io::ErrorKind::PermissionDenied, | ||
"Permission denied", | ||
)), | ||
MvError::NotAllFilesMoved => { | ||
Err(io::Error::new(io::ErrorKind::Other, format!(""))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Empty string? This looks like a mistake, although I must admit that I'm simply guessing.
src/uu/mv/src/mv.rs
Outdated
Some( | ||
ProgressBar::new(dir_size).with_style( | ||
ProgressStyle::with_template( | ||
"{msg}: [{elapsed_precise}] {wide_bar} {bytes:>7}/{total_bytes:7}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there any tests that ensure that the values actually make sense? It doesn't have to be all of them, but I don't see any at all that assert specific value(range)s for bytes, total_bytes, etc.
Maybe I missed something?
}; | ||
} | ||
|
||
fn create_folder(&self, path: &Path) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Directories are abbreviated as dir
everywhere else. Is a folder something different, or a synonym?
src/uu/mv/src/mv.rs
Outdated
let dir_entry_to = to.join(tmp_to); | ||
if file_type.is_dir() { | ||
// check this create all align with gnu | ||
let res = create_all(&dir_entry_to, false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Metadata like rwx
flags aren't copied. It's okay if you decide that this has to be done in a different PR, but this seems like an easy fix, maybe?
src/uu/mv/src/mv.rs
Outdated
while !moved_entries.is_empty() { | ||
let (moved_path, file_type) = moved_entries.pop().unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: There's a Rust-feature for that. (The compiler probably emits the same code for both.)
while !moved_entries.is_empty() { | |
let (moved_path, file_type) = moved_entries.pop().unwrap(); | |
while let Some((moved_path, file_type)) = moved_entries.pop() { |
error_occurred = true; | ||
show_error!("cannot remove {}: {}", moved_path.quote(), err); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check how GNU mv
behaves and add a test:
- Does it abort on the first
rm
error, e.g. if the file is readable but cannot be removed for some reason? - Does it continue removing the rest?
- If a single file in a nested directory cannot get removed, does this result in multiple error messages, or just one?
@BenWiederhake no not at all also thanks for helping me with this 😃 |
GNU testsuite comparison:
|
Co-authored-by: Sylvestre Ledru <sylvestre@debian.org>
17c3019
to
2b7448e
Compare
GNU testsuite comparison:
|
fix for #6631
behaviours changed
no-target-directory
was specified and the destination ended with a/
This patch resolves that issue.